1.  Introduction  and  Summary

In some industrial processes, observations ordinarily arise from some
distribution function $F_1(x;\theta)$, say, but occasionally the process
yields "outliers" which may follow some other distribution
$F_2(x;\varphi)$. Accordingly, if outliers carry no obvious label, the
process produces observations according to a mixture distribution
$\alpha F_1(x;\theta) + (1-\alpha)F_2(x;\varphi)$ for some proportion
$\alpha$, $0 < \alpha < 1$. Similarly, in marine biology, one may be
interested in studying certain characteristics of a fish. For this
purpose, samples of fish are taken and the desired trait is measured for
each fish. Since many characteristics vary according to the age of fish,
the trait has a distinct distribution for each age group and the
population has a mixture of distributions. Mixtures of distributions also
occur in the compound decision problems proposed by Robbins [8], in which
mixing distributions correspond to some a priori distributions.

In many cases an experimenter is faced with the problem of choosing one
or more "desirable" processes (treatments) from among k given processes
(treatments) which produce observations according to some mixture
distributions. For a particular purpose, the experimenter may need one
or more processes which are associated with the largest (smallest)
proportion in the mixtures of distributions. For instance, he may need
the process which has the least proportion of occurrence of outliers.

The estimation of these mixture proportions is not easy. For example,
when $F_1(x;\theta)$ and $F_2(x;\varphi)$ are both normal with a common
variance $\sigma^2$ and with means $\mu_1$ and $\mu_2$ respectively, if
$\mu_1$ and $\mu_2$ are not well separated, i.e. when
$d = |\mu_1 - \mu_2|/\sigma$ is small, it is almost impossible to classify
the observations from the mixture distribution into two groups. To be
more precise, let $I(\alpha;F_1,F_2)$ denote the Fisher information for
the estimation of $\alpha$. Hill [5] pointed out that for $d$ small,
$I(\alpha;F_1,F_2) \approx d^2$. Therefore, if $d$ is in $(1/8, 1/4)$,
then for the maximum likelihood estimate of $\alpha$ to have standard
deviation 0.1, the sample size needed is large, as big as 6400 (the
asymptotic variance being about $1/(n d^2)$). So far, however, the
classical efficient method for the estimation of $\alpha$ is the maximum
likelihood estimate; the usual moment estimate is inefficient. Moreover,
when the number of components of the mixture increases, the situation
becomes more complicated even for the maximum likelihood estimates. This
suggests that another approach should be considered. The so-called
minimum distance method of Wolfowitz [12] seems reasonable. Large sample
properties like consistency can be shown to hold. If the distance between
two distribution functions is properly chosen, some other optimal
properties may hold, and if the rate of convergence is fair, this
approach should be appropriate. In problems of selection and ranking,
some statistics are needed on which the criterion of priority of
selection can be based. Though these statistics may not necessarily be
good estimates for the unknown parameters under consideration, most of
them may do well for the selection problem. Accordingly, the minimum
distance method seems a natural approach to the problem of selecting the
populations with the largest (or smallest) proportion of a certain
component of the mixture. The so-called least squares method will be
applied in this paper to the selection problem.
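
The sample size quoted from Hill [5] can be checked with a few lines of
arithmetic. The sketch below is only an illustration: it assumes the
large-sample approximation Var ≈ 1/(n I) for the maximum likelihood
estimate together with I ≈ d^2 for small d, and the function name
`mle_sample_size` is hypothetical, not taken from the paper.

```python
def mle_sample_size(d, target_sd):
    """Approximate sample size for which the MLE of the mixing proportion
    has the requested standard deviation.

    Assumptions (illustrative only): Var(alpha_hat) ~ 1 / (n * I) and
    Fisher information I ~ d**2, valid for small separation d.
    """
    information = d ** 2
    return round(1.0 / (information * target_sd ** 2))


if __name__ == "__main__":
    # The two end points of the interval (1/8, 1/4) discussed in the text.
    for d in (1 / 8, 1 / 4):
        print(d, mle_sample_size(d, 0.1))
```

Halving the separation d quadruples the required sample size under this
approximation, which is why poorly separated components are so costly.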

In Section 2 some notation is defined and the problem is formulated.
A class of consistent selection procedures is defined in Section 3 and
some asymptotic optimal properties are shown.

2.  Notation  and  Formulation  of  the  Problem 

The problem of identifiability should be mentioned, since the selection
problem for the proportion of mixtures is related to the identifiability
problem. This can be simply illustrated by an example. Let $B(n;p)$
denote a binomial distribution with success probability $p$. Then one can
find some $\alpha_1, \alpha_2, \beta_1, \beta_2$ and some $p_1, p_2, p_3$
and $p_1', p_2', p_3'$ such that
$\alpha_1 B(n;p_1) + \alpha_2 B(n;p_2) + (1-\alpha_1-\alpha_2)B(n;p_3)$ and
$\beta_1 B(n;p_1') + \beta_2 B(n;p_2') + (1-\beta_1-\beta_2)B(n;p_3')$
represent the same mixture distribution if $n < 5$. In this example, it
is impossible to identify and select. Necessary and sufficient conditions
for identifiability of finite mixtures can be found in [11] and [13].

Let $\mathcal{F}$ denote a family of distributions such that the
associated convex hull of $\mathcal{F}$ is identifiable. Many well-known
families of distributions are included. For example (see [11], [13]),
$\mathcal{F}$ can be the family of p-variate normal distributions,
products of n exponential distributions, binomial distributions with
different integral parameters, the translation parameter family induced
by a certain univariate cdf, the union of the families of products of n
exponential distributions and the p-variate normal distributions, etc.

For convenience, for some prefixed integer $m$, we define

(2.1)  $\langle 0,1\rangle^m = \{(\alpha_1,\alpha_2,\ldots,\alpha_m) : \alpha_i \ge 0,\ \sum_{i=1}^{m} \alpha_i = 1\}$  $(m \ge 2)$.


Let $\lambda$ be a real-valued continuous function on
$\langle 0,1\rangle^m$. Let the functions
$F_1(x;\theta_1), \ldots, F_m(x;\theta_m) \in \mathcal{F}$, where
$\theta_i$ may be a parameter vector and $F_i(x;\theta_i)$ and
$F_j(x;\theta_j)$ may have different parametric forms; for instance,
$F_i(x;\theta_i)$ may be a normal distribution with location-scale
parameter $(\mu_i,\sigma_i^2)$ and $F_j(x;\theta_j)$ may be an exponential
distribution with location-scale parameter $(\theta_j,\beta_j)$. For
convenience, we denote

(2.2)  $\mathbf{F} = (F_1(x;\theta_1), \ldots, F_m(x;\theta_m))$,

(2.3)  $\boldsymbol{\alpha}_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{im})$.


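A finite mixture distribution with $m$ components is defined to be the
inner product of a certain
$\boldsymbol{\alpha} \in \langle 0,1\rangle^m$ and $\mathbf{F}$, i.e.

(2.4)  $G(x;\boldsymbol{\alpha}) = \boldsymbol{\alpha}\cdot\mathbf{F} = \sum_{i=1}^{m} \alpha_i F_i(x;\theta_i)$.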
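
The definition (2.4) can be made concrete with a small numerical sketch.
The code below evaluates a finite mixture cdf as the inner product of a
weight vector and a list of component cdfs. The helper names
(`normal_cdf`, `exponential_cdf`, `mixture_cdf`) and the particular
components are illustrative assumptions, not part of the paper.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal cdf with location mu and scale sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def exponential_cdf(x, loc=0.0, scale=1.0):
    """Exponential cdf with location loc and scale scale."""
    return 0.0 if x < loc else 1.0 - math.exp(-(x - loc) / scale)


def mixture_cdf(x, alpha, components):
    """G(x; alpha) = sum_i alpha_i * F_i(x), as in (2.4).

    alpha must lie in the simplex <0,1>^m of (2.1); components is a list
    of m one-argument cdfs with their parameters already fixed.
    """
    if len(alpha) != len(components):
        raise ValueError("alpha and components must have the same length")
    if any(a < 0 for a in alpha) or abs(sum(alpha) - 1.0) > 1e-12:
        raise ValueError("alpha must be non-negative and sum to one")
    return sum(a * F(x) for a, F in zip(alpha, components))


if __name__ == "__main__":
    comps = [normal_cdf, exponential_cdf]  # components of different form
    print(mixture_cdf(0.0, [0.5, 0.5], comps))
```

The components are passed as plain callables so that, as in the text,
they may have different parametric forms.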

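Let $\pi_1, \pi_2, \ldots, \pi_k$ be k populations such that $\pi_i$ has
cdf $G(x;\boldsymbol{\alpha}_i)$ (defined by (2.4)) for some unknown
parameter $\boldsymbol{\alpha}_i \in \langle 0,1\rangle^m$. Let
$X_{i1}, X_{i2}, \ldots, X_{in}$ be n random observations from $\pi_i$,
$i = 1,2,\ldots,k$. Let $G_{in}(x)$ denote the associated empirical
distribution function. Let
$\lambda_{[1]}(\boldsymbol{\alpha}) \le \lambda_{[2]}(\boldsymbol{\alpha}) \le \cdots \le \lambda_{[k]}(\boldsymbol{\alpha})$
denote the ordered values of
$\lambda(\boldsymbol{\alpha}_1), \lambda(\boldsymbol{\alpha}_2), \ldots, \lambda(\boldsymbol{\alpha}_k)$.

Based on n independent observations from each population, we are
interested in selecting t ($1 \le t \le k-1$) populations, say
$\pi_{r_1}, \pi_{r_2}, \ldots, \pi_{r_t}$, such that
$\lambda(\boldsymbol{\alpha}_{r_1}), \lambda(\boldsymbol{\alpha}_{r_2}), \ldots, \lambda(\boldsymbol{\alpha}_{r_t})$
are the t largest. We call these populations the t best.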

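We approach the problem by the indifference zone formulation. For
convenience, we introduce the following notation.

For given $\Delta$, we define

(2.5)  $\Omega(\lambda;\Delta) = \{(\boldsymbol{\alpha}_1,\boldsymbol{\alpha}_2,\ldots,\boldsymbol{\alpha}_k) : \boldsymbol{\alpha}_i \in \langle 0,1\rangle^m,\ \lambda_{[k-t+1]}(\boldsymbol{\alpha}) - \lambda_{[k-t]}(\boldsymbol{\alpha}) \ge \Delta\}$.

For specified $\mathbf{F}$ and $\lambda$, we consider our problem on the
configuration $\Omega(\lambda;\Delta)$ for given $\Delta$ in the
indifference zone approach. We also define

(2.6)  $\Omega = \langle 0,1\rangle^m \times \langle 0,1\rangle^m \times \cdots \times \langle 0,1\rangle^m$  (k copies).

Finally, we define, for given p, 0 < p < 1,

(2.7)  $S(\boldsymbol{\alpha};H) = \int (\boldsymbol{\alpha}\cdot\mathbf{F} - G_n(x))^2\, dH(x)$,

where $\boldsymbol{\alpha}\cdot\mathbf{F}$ is a mixture distribution for
$\boldsymbol{\alpha} \in \langle 0,1\rangle^m$, $G_n(x)$ is the empirical
distribution associated with some $\boldsymbol{\alpha}_0\cdot\mathbf{F}$
for unknown $\boldsymbol{\alpha}_0 \in \langle 0,1\rangle^m$, and $H(x)$
is a cdf. Hence, $S(\boldsymbol{\alpha};H)$ is a function of
$\boldsymbol{\alpha} \in \langle 0,1\rangle^m$.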


3.  A class  of  consistent  selection  procedures 


In this section, we consider the cases when the components of
$\mathbf{F}$ are continuous and discrete. In each case, we assume the
components $F_j(x;\theta_j)$ of $\mathbf{F}$ are completely known.

(A)  Continuous case

We assume the parametric form of each component $F_i(x;\theta_i)$ of
$\mathbf{F}$ is continuous in x for each $\theta_i$ and continuous in
$\theta_i$ for each x.

For given n observations from a population with cdf
$G(x;\boldsymbol{\alpha}_0) = \boldsymbol{\alpha}_0\cdot\mathbf{F}$ for
some unknown $\boldsymbol{\alpha}_0$, and a given cdf $H(x)$, a vector
$\hat{\boldsymbol{\alpha}} \in \langle 0,1\rangle^m$ at which
$S(\boldsymbol{\alpha};H)$ attains its infimum seems a "good" estimate of
the real $\boldsymbol{\alpha}_0$ in the sense of the least squares
method. It is to be noted that $\hat{\boldsymbol{\alpha}}$ is a statistic
of the n observations and is also a function of $\mathbf{F}$ and $H$.

A good choice, in some sense, of the weight function $H(x)$ is not easy.
Bartlett and Macdonald [1] study some special cases with m = 2. For
$m \ge 3$, the situation is complicated.

Choi and Bulgren [3] consider the case $H(x) = G_n(x)$ and obtain some
optimal properties like consistency and asymptotic normality.

However, for the case of small samples, Macdonald [7] points out that,
using $H(x) = \boldsymbol{\alpha}\cdot\mathbf{F}$, some Monte Carlo
results show some improvement over Choi and Bulgren's result. As a matter
of fact, for $H(x) = \boldsymbol{\alpha}\cdot\mathbf{F}$,
$S(\boldsymbol{\alpha};H)$ is the von Mises statistic for goodness of
fit. Let
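$U_1$ and $U_2$ denote two random observations from the population with
cdf $F(x)$, and let $V_1$ and $V_2$ denote two random observations from a
population with cdf $G(x)$. It is known (Lehmann [6]) that

$P\{U_1 \vee U_2 < V_1 \wedge V_2 \ \text{or}\ V_1 \vee V_2 < U_1 \wedge U_2\} = \frac{1}{3} + 2\,\Delta(F,G)$,
$\Delta(F,G) = \int_{-\infty}^{\infty} (F(x) - G(x))^2\, d\left[\tfrac{1}{2}(F(x) + G(x))\right]$,

where $a \vee b = \max(a,b)$ and $a \wedge b = \min(a,b)$. Note that
$\Delta(F,G) = 0$ if, and only if, $F \equiv G$. Roughly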
speaking, taking $F(x)$ to be $\boldsymbol{\alpha}\cdot\mathbf{F}$ and
$G(x)$ to be $G_n(x)$, it is meaningful to consider
$H(x) = \frac{1}{2}(\boldsymbol{\alpha}\cdot\mathbf{F} + G_n(x))$ for our
case. Accordingly, in general, we consider the case
$H(x) = p\,\boldsymbol{\alpha}\cdot\mathbf{F} + (1-p)G_n(x)$ for
$0 \le p \le 1$. Note that p = 0 yields Choi and Bulgren's case and
p = 1 gives Macdonald's case. For notational convenience, henceforth, we
define

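(3.1)  $S_i(\boldsymbol{\alpha};p) = \int_{-\infty}^{\infty} (\boldsymbol{\alpha}\cdot\mathbf{F} - G_{in}(x))^2\, d\left(p\,\boldsymbol{\alpha}\cdot\mathbf{F} + (1-p)G_{in}(x)\right)$,

which is obtained by taking
$H(x) = p\,\boldsymbol{\alpha}\cdot\mathbf{F} + (1-p)G_{in}(x)$, where
$G_{in}(x)$ is the empirical distribution associated with the n random
observations from the population $\pi_i$. The existence of some
$\hat{\boldsymbol{\alpha}}_i$ at which $S_i(\boldsymbol{\alpha};p)$
attains the infimum can be shown by arguments analogous to those in [3].
Define $\hat{\boldsymbol{\alpha}}_i$ to be such that

(3.2)  $S_i(\hat{\boldsymbol{\alpha}}_i;p) = \inf_{\boldsymbol{\alpha} \in \langle 0,1\rangle^m} S_i(\boldsymbol{\alpha};p)$.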
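
To make (3.1) and (3.2) concrete, the following sketch evaluates the
criterion in the special case p = 0, where the integral against the
empirical distribution reduces to an average over the order statistics,
and minimizes it over a grid for m = 2. The grid search, the function
names and the toy components are illustrative assumptions; the paper only
asserts that a minimizer exists.

```python
def ls_objective_p0(alpha, components, sample):
    """S_i(alpha; 0): average of (alpha.F(x_(j)) - j/n)^2 over the ordered
    sample, using G_n(x_(j)) = j/n for the empirical cdf."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for j, x in enumerate(xs, start=1):
        g = sum(a * F(x) for a, F in zip(alpha, components))
        total += (g - j / n) ** 2
    return total / n


def grid_estimate_m2(components, sample, steps=1000):
    """Grid minimizer of S_i(alpha; 0) for m = 2 (illustrative only).

    Returns the weight of the first component, searched over
    {0, 1/steps, ..., 1}.
    """
    best = min(
        range(steps + 1),
        key=lambda s: ls_objective_p0(
            (s / steps, 1.0 - s / steps), components, sample
        ),
    )
    return best / steps


def unit_cdf(x):
    """cdf of the uniform distribution on [0, 1] (toy component)."""
    return min(1.0, max(0.0, x))


def square_cdf(x):
    """cdf F(x) = x^2 on [0, 1] (toy component)."""
    return min(1.0, max(0.0, x)) ** 2


if __name__ == "__main__":
    n = 50
    sample = [(j - 0.5) / n for j in range(1, n + 1)]  # uniform quantiles
    print(grid_estimate_m2([unit_cdf, square_cdf], sample, steps=200))
```

For general m a constrained optimizer over the simplex would replace the
grid; for general p the integral against the mixture part must also be
evaluated.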


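For a given value of p ($0 \le p \le 1$), we define a selection procedure
$R_p$ as follows.

Take n independent observations from each $\pi_i$ and construct the
empirical distribution $G_{in}(x)$. Compute
$\hat{\boldsymbol{\alpha}}_i = \hat{\boldsymbol{\alpha}}_i(X_{i1}, X_{i2}, \ldots, X_{in})$,
which is defined by (3.1) and (3.2). Let
$\hat{\lambda}_{[1]} \le \hat{\lambda}_{[2]} \le \cdots \le \hat{\lambda}_{[k]}$
denote the ordered values of
$\lambda(\hat{\boldsymbol{\alpha}}_1), \lambda(\hat{\boldsymbol{\alpha}}_2), \ldots, \lambda(\hat{\boldsymbol{\alpha}}_k)$.

$R_p$:  Select $\pi_i$ if, and only if,
$\lambda(\hat{\boldsymbol{\alpha}}_i) \ge \hat{\lambda}_{[k-t+1]}$.
Use a random mechanism when a tie occurs.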
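
The selection step of such a rule can be sketched as follows, assuming
the values of the selection criterion evaluated at the estimates have
already been computed for the k populations. The function name
`select_t_best` and the particular tie-breaking device (a random
secondary sort key) are illustrative assumptions; the text only requires
that some random mechanism be used for ties.

```python
import random


def select_t_best(lambda_values, t, rng=None):
    """Return the indices (0-based, sorted) of the t populations with the
    largest criterion values, breaking ties at random."""
    rng = rng or random.Random()
    k = len(lambda_values)
    if not 1 <= t <= k - 1:
        raise ValueError("t must satisfy 1 <= t <= k - 1")
    # Pair each value with an independent random key so that tied values
    # are ordered at random.
    keyed = [(lambda_values[i], rng.random(), i) for i in range(k)]
    keyed.sort(reverse=True)
    return sorted(i for _, _, i in keyed[:t])


if __name__ == "__main__":
    print(select_t_best([0.2, 0.7, 0.5, 0.9], t=2))
```

Exactly t populations are returned, which corresponds to the rule above
once ties at the cut-off value have been resolved.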

By a correct selection (CS) we mean that the set of t populations
associated with the t largest values of
$\lambda(\boldsymbol{\alpha}_1), \lambda(\boldsymbol{\alpha}_2), \ldots, \lambda(\boldsymbol{\alpha}_k)$
is selected.

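Definition 3.1  A selection procedure R is consistent with respect to
$(\mathcal{F},\lambda)$ if

$\lim_{\Delta \to 0}\ \lim_{n \to \infty}\ \inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS \mid R\} = 1$.

Definition 3.2  A selection procedure R is strongly asymptotically
monotone with respect to $(\mathcal{F},\lambda)$ if
$\lambda(\boldsymbol{\alpha}_i) < \lambda(\boldsymbol{\alpha}_j)$ implies,
for any $\epsilon > 0$,

$\lim_{n \to \infty}\ \sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{\pi_i \text{ is selected} \mid R\} - \epsilon < \lim_{n \to \infty} P_{\boldsymbol{\alpha}}\{\pi_j \text{ is selected} \mid R\}$.

Theorem 3.1  For any value of p, $0 \le p \le 1$, $R_p$ is consistent and
strongly asymptotically monotone with respect to $(\mathcal{F},\lambda)$.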

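Proof:  (a) We show that for any p ($0 \le p \le 1$) and for each i
($i = 1,2,\ldots,k$), $\hat{\boldsymbol{\alpha}}_i \to \boldsymbol{\alpha}_i$
with probability one. We follow the arguments given in [3] with
appropriate modifications. Now, by the Glivenko-Cantelli theorem, for
$\epsilon > 0$ there exists $N(\epsilon)$ such that, whenever
$n > N(\epsilon)$,

$P\{\sup_x |G_i(x) - G_{in}(x)| < \epsilon\} = P\{\sup_x |\boldsymbol{\alpha}_i\cdot\mathbf{F} - G_{in}(x)| < \epsilon\} = 1$.

Replacing $dF_n(x)$ by
$d(p\,\boldsymbol{\alpha}_i\cdot\mathbf{F} + (1-p)G_{in}(x))$ and
following the same argument as given in the proof of Theorem 2 in [3],
the result follows.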


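(b)  Consistency of $R_p$

Since $\lambda$ is continuous, it is true that
$\lambda(\hat{\boldsymbol{\alpha}}_i) \to \lambda(\boldsymbol{\alpha}_i)$
WP1 ($i = 1,2,\ldots,k$). Now, by Egoroff's theorem, for $\epsilon > 0$
and $\delta > 0$ there exist $N_i(\epsilon,\delta)$ and $B_i$ such that
the sample space is decomposed into $A_i \cup B_i$, with $A_i$ the
complement of $B_i$, $P(B_i) > 1 - \epsilon/k$, and on $B_i$,
$|\lambda(\hat{\boldsymbol{\alpha}}_i) - \lambda(\boldsymbol{\alpha}_i)| < \delta$
whenever $n \ge N_i(\epsilon,\delta)$, uniformly in
$\boldsymbol{\alpha}_i \in \langle 0,1\rangle^m$, i.e.
$N_i(\epsilon,\delta)$ is independent of $\boldsymbol{\alpha}_i$. Note
that $\lambda(\hat{\boldsymbol{\alpha}}_i)$ depends on n. Set
$N = N_1(\epsilon,\delta) + \cdots + N_k(\epsilon,\delta)$ and set
$B = \bigcap_{i=1}^{k} B_i$. Then $P(B) > 1 - \epsilon$, and on B,
whenever $n > N$,
$\max_{1 \le i \le k} |\lambda(\hat{\boldsymbol{\alpha}}_i) - \lambda(\boldsymbol{\alpha}_i)| < \delta$
uniformly for each
$(\boldsymbol{\alpha}_1,\boldsymbol{\alpha}_2,\ldots,\boldsymbol{\alpha}_k) \in \Omega$
(defined by (2.6)). Now, for any given $P^* \in (0,1)$ and for given
$\Delta > 0$, however small, choose $\delta = \Delta/3 > 0$ and
$\epsilon = 1 - P^*$. On $\Omega(\lambda;\Delta)$ we have
$\lambda_{[k-t+1]}(\boldsymbol{\alpha}) - \lambda_{[k-t]}(\boldsymbol{\alpha}) \ge \Delta = 3\delta$.
Hence, on B and for $n > N$, each of the t best populations has a value
of $\lambda(\hat{\boldsymbol{\alpha}})$ strictly larger than that of every
other population, and we conclude that
$P_{\boldsymbol{\alpha}}\{CS \mid R_p\} \ge P(B) > P^*$ for every
$\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)$. Hence, we have shown
that for every $\Delta > 0$,
$\lim_{n \to \infty} \inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS \mid R_p\} = 1$.
Hence the consistency is shown.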
a - P 


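(c)  Suppose $\lambda(\boldsymbol{\alpha}_i) < \lambda(\boldsymbol{\alpha}_j)$.

(i)  If $\lambda(\boldsymbol{\alpha}_i) \le \lambda_{[k-t]}(\boldsymbol{\alpha})$
and $\lambda(\boldsymbol{\alpha}_j) \ge \lambda_{[k-t+1]}(\boldsymbol{\alpha})$,
take $P^* \ge 2/3$ and go through the arguments given in (b). We conclude
that
$\inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{\pi_j \text{ is selected} \mid R_p\} \ge \inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS \mid R_p\} \ge 2/3$
whenever $n \ge N_0 = N_0(\Delta)$ for some $N_0$. On the other hand, for
each $n \ge N_0$,
$\{\pi_i \text{ is selected} \mid R_p\} \subset \{\text{selection is not correct} \mid R_p\}$.
Hence
$P_{\boldsymbol{\alpha}}\{\pi_i \text{ is selected} \mid R_p\} \le 1 - P_{\boldsymbol{\alpha}}\{CS \mid R_p\} \le 1/3$
for all $\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)$, i.e.
$\sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{\pi_i \text{ is selected} \mid R_p\} \le 1/3$
for each $n \ge N_0$.

(ii)  Suppose both $\lambda(\boldsymbol{\alpha}_i)$ and
$\lambda(\boldsymbol{\alpha}_j)$ are no larger than
$\lambda_{[k-t]}(\boldsymbol{\alpha})$. Then, for $\epsilon > 0$ and by
the arguments in (b), there exist a subset B of the sample space and an
integer $N_0$ such that $P\{B\} > 1 - \epsilon$ and, for $n \ge N_0$ and
on B,
$\max_{1 \le l \le k} |\lambda(\hat{\boldsymbol{\alpha}}_l) - \lambda(\boldsymbol{\alpha}_l)| < \Delta/3$.
Let E denote the event [$\pi_i$ is selected $\mid R_p$]. Then
$E = (E \cap B) \cup (E \cap B^c)$. Hence
$\sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{E\} \le \sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{E \cap B\} + \sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{E \cap B^c\} \le \sup_{\boldsymbol{\alpha}} P_{\boldsymbol{\alpha}}\{E \cap B\} + \epsilon$,
since $P_{\boldsymbol{\alpha}}\{E \cap B^c\} \le P\{B^c\} < \epsilon$ for
all $\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)$. Note that for any
$\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)$,
$P_{\boldsymbol{\alpha}}\{E \cap B\} = 0$, since on B,
$\lambda(\hat{\boldsymbol{\alpha}}_i) < \lambda_{[k-t]}(\boldsymbol{\alpha}) + \Delta/3$
while each of the t best populations has
$\lambda(\hat{\boldsymbol{\alpha}}) > \lambda_{[k-t+1]}(\boldsymbol{\alpha}) - \Delta/3 \ge \lambda_{[k-t]}(\boldsymbol{\alpha}) + 2\Delta/3$.

(iii)  If $\lambda(\boldsymbol{\alpha}_i)$ and
$\lambda(\boldsymbol{\alpha}_j)$ both are no less than
$\lambda_{[k-t+1]}(\boldsymbol{\alpha})$, the proof is analogous to the
case of (ii).

The proof is thus complete.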
Remark 3.1.  Let $t_1, t_2, \ldots, t_m$ be positive integers such that each


t^  is  no  larger  than  k - 1.  Let  n(t^,  t^.-.-.t^)  = {(oij.  a2,...,aj^): 

> °^[k-t.  + l]  ^ ~ l,2,...,m}  where  a|j|  denotes  the  j-th  largest 

^ ^ 111  f21 

value  of  the  i-th  component  of  a and  we  denote  a = (a^  \ a \ . 

~1  -2  ’-k  -r  r r ’ 

If for each i we wish to select the $t_i$ largest in the i-th component
simultaneously, then, using the statistics
$\{\hat{\boldsymbol{\alpha}}_1, \hat{\boldsymbol{\alpha}}_2, \ldots, \hat{\boldsymbol{\alpha}}_k\}$
defined by (3.2), we select, for the i-th component, those populations
which have the $t_i$ largest values in the i-th component of
$\hat{\boldsymbol{\alpha}}_r$ ($i = 1,2,\ldots,m$). It can be shown that
the simultaneous selections are also consistent and strongly
asymptotically monotone on the configuration
$\Omega(t_1, t_2, \ldots, t_m)$.

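Remark 3.2.  For m = 2 and a given n, let $\alpha_i'$, $\alpha_i''$ and
$\hat{\alpha}_i$ denote, respectively, the least squares estimates
associated with p = 0, p = 1 and some p (0 < p < 1). Then one obtains

$\alpha_i' = \dfrac{\sum (F_2 - F_1)(F_2 - j/n)}{\sum (F_2 - F_1)^2}$,
$\alpha_i'' = \alpha_i' + \dfrac{1}{2n}\,\dfrac{\sum (F_2 - F_1)}{\sum (F_2 - F_1)^2}$,
$\hat{\alpha}_i = \alpha_i' + \dfrac{p}{2n}\,\dfrac{\sum (F_2 - F_1)}{\sum (F_2 - F_1)^2}$,

where $\sum (F_2 - F_1) = \sum_{j=1}^{n} (F_2(x_{[j]}) - F_1(x_{[j]}))$
and $x_{[1]} \le x_{[2]} \le \cdots \le x_{[n]}$ are the order statistics
from $\pi_i$. These follow from (3.1) because the p = 0 part of the
criterion compares the mixture with j/n at $x_{[j]}$, while the p = 1
part (the von Mises statistic) compares it with (2j-1)/(2n). As a
convention we take $\alpha_i' = 0$ if $\alpha_i' < 0$ and
$\alpha_i' = 1$ if $\alpha_i' > 1$, and use the same convention for the
other two cases. It can be seen that $\hat{\alpha}_i$ is always between
$\alpha_i'$ and $\alpha_i''$ for all n. If $F_1$ and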
$F_2$ are "smooth" in some sense, we see that $|\alpha_i' - \alpha_i''|$ = O(n
for  e > 0* 
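
The closed form for m = 2 can be sketched numerically. The code below is
an illustration under stated assumptions: the mixture is written as
alpha F_1 + (1 - alpha) F_2, the p = 0 part of criterion (3.1) compares
the mixture with j/n at the j-th order statistic, the p = 1 part compares
it with (2j - 1)/(2n), and the estimate is truncated to [0, 1]. The
function names and the toy components are hypothetical.

```python
def ls_estimate_m2(F1, F2, sample, p=0.0):
    """Least squares estimate of alpha in alpha*F1 + (1-alpha)*F2.

    Minimizing p * sum (G_j - (2j-1)/(2n))^2 + (1-p) * sum (G_j - j/n)^2
    in alpha, with G_j the mixture cdf at the j-th order statistic, gives
    alpha = sum D_j (F2_j - j/n + p/(2n)) / sum D_j^2, D_j = F2_j - F1_j.
    """
    xs = sorted(sample)
    n = len(xs)
    num = 0.0
    den = 0.0
    for j, x in enumerate(xs, start=1):
        d = F2(x) - F1(x)
        num += d * (F2(x) - j / n + p / (2.0 * n))
        den += d * d
    if den == 0.0:
        raise ValueError("F1 and F2 coincide on the sample")
    # Truncate to [0, 1], following the convention stated in the remark.
    return min(1.0, max(0.0, num / den))


def unit_cdf(x):
    """cdf of the uniform distribution on [0, 1] (toy component)."""
    return min(1.0, max(0.0, x))


def square_cdf(x):
    """cdf F(x) = x^2 on [0, 1] (toy component)."""
    return min(1.0, max(0.0, x)) ** 2


if __name__ == "__main__":
    n = 100
    # Quantiles of the mixture 0.5*x + 0.5*x^2 at levels (j - 0.5)/n.
    sample = [-0.5 + (0.25 + 2.0 * (j - 0.5) / n) ** 0.5
              for j in range(1, n + 1)]
    for p in (0.0, 0.5, 1.0):
        print(p, ls_estimate_m2(unit_cdf, square_cdf, sample, p))
```

In this toy example the estimate for an intermediate p lies between the
p = 0 and p = 1 estimates, in line with the remark.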


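Definition 3.3  A selection procedure R is consistent of order
$O(A(\Delta))$ ($o(A(\Delta))$) with respect to $(\mathcal{F},\lambda)$ if

$\lim_{\Delta \to 0,\ n = O(A(\Delta))}\ \inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS \mid R\} = 1$
$\left(\lim_{\Delta \to 0,\ n = o(A(\Delta))}\ \inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS \mid R\} = 1\right)$.

Theorem 3.2  For given p, $0 \le p \le 1$, $R_p$ is consistent of order
$O(\Delta^{-2/(1-2\delta)})$, $0 < \delta < 1/2$.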

Proof:  We note that, by the Glivenko-Cantelli theorem,
$\sup_x |G_i(x) - G_{in}(x) + o(1)| \to 0$ WP1 as $n \to \infty$, where
$o(1)$ is independent of x. For any fixed i ($1 \le i \le k$), let
$S(\boldsymbol{\alpha}_i;p)$ denote the $m-1$ equations obtained by
differentiating with respect to $\alpha_{ij}$, $j = 1,2,\ldots,m-1$, where
$\boldsymbol{\alpha}_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{i,m-1}, 1 - \sum_{j=1}^{m-1} \alpha_{ij})$.
Then, the first element of $S(\boldsymbol{\alpha}_i;p)$, for j = 1,
becomes

j = l 


^ n m . 

- y F,  (X.  r.,  ;e,3{  y a.  F (X.  ..,;0 
n A,  1 i[lJ  ^ r-i  i[l]  f n 2n' 


iSuplC.(X)-G.«,(x).c,Cl)|i  I 

X J = 1 


where $X_{i[1]} \le \cdots \le X_{i[n]}$ are the order statistics from
$\pi_i$. Following arguments analogous to the proof of Theorem 4 of [3],
we conclude that
$|\hat{\boldsymbol{\alpha}}_i - \boldsymbol{\alpha}_i| < O(n^{-1/2+\delta})$
for all but finitely many n with probability 1, where
$0 \le \delta < 1/2$.

Now, if we take $\Delta/2 = O(n^{-1/2+\delta})$ for large n, we see that
$n = O(\Delta^{-2/(1-2\delta)})$, and for this n the selection is correct
with probability one as $\Delta \to 0$. The proof is thus complete.

Let $\bar{\boldsymbol{\alpha}}_i$ denote the arithmetic mean of r
independent estimates of $\boldsymbol{\alpha}_i$, where r is some
integer. This means rn observations are drawn from each population, and
for each subgroup of n observations we obtain an estimate
$\hat{\boldsymbol{\alpha}}_i$ for the population $\pi_i$. If n is large
and t = 1, we propose the following rule $R_p'$.

$R_p'$:  Select $\pi_i$ if $\bar{\alpha}_{i1} \ge \bar{\alpha}_{j1}$ for
all $j \ne i$,

where $\bar{\alpha}_{i1}$ is the first component of
$\bar{\boldsymbol{\alpha}}_i$.


Theorem 3.3  If n is large, t = 1, and
$\lambda(\boldsymbol{\alpha}_i) = \alpha_{i1}$, the projection function,
then we have


inf  P {CS|r'}  > / n 4>(6.z  + — ■ -)d<l>(z) 

aefi(X;A)  ? P -CO  j=2  J °[j] 


where  ♦(x)  denotes  the  standard  normal  distribution  and 


0=2/  / G (x)[l-G  (y)]dB  (x)dB  (y) 

_00<)^<y%tfoo  ^ J J J 


where 


B^(x)  = F^(x;epG.(x)  - / Cx;6j)dG.  (x) 
for  j=l,2,...,k. 


and  < 0^2]  1 •••  i<^[k]*  = “[l]/'^[j]- 

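Proof:  It has been shown in [2] that $\hat{\boldsymbol{\alpha}}_i$ is
asymptotically normal and hence the first component of
$\hat{\boldsymbol{\alpha}}_i$, say $\hat{\alpha}_{i1}$, is asymptotically
normal with mean $\alpha_{i1}$ and variance

$\sigma_i^2 = 2 \iint_{-\infty < x < y < \infty} G_i(x)[1 - G_i(y)]\, dB_i(x)\, dB_i(y)$,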


where 
B.(x)  = F^(x;0j)G.(x)  - / (x;6 j)dG. (x) 


Hence,  when  n is  large,  t = 1,  we  have  for  o6fi(A;A) 


P^{GS|R'}  « > a j = l,2, . . . ,k-l  |a  =max  a } 

a{  a,  - a n ^ 


a . a, 
J k 


a . 


P {Z,  > Z.(-2-)  - 

/M  L« 1 ' rr  ' 


/r  A 


-•a-k-  jV  "k 


j = l,2,...,k}  (where  , Z2, . . . , Zj^  are 
iid  standard  normal) 


00  O / 

= / n Z + -^)dnz) 
j=i  j j 


k-l 


> / n 4>  (6 . z + 
-*  j=l  ^ 


A 


~)d'i(z)  (by  a lenma  in  [4]) 


where  6.  - 1 ^ [2]  - ^ ^[k] 

This  completes  the  proof. 


Asymptotic relative efficiency of $R_p$ with respect to a procedure $R_B$

We assume m = 2, t = 1, and $\lambda$ is a projection function. In this
case we have
$G_i(x) = \alpha_i F_1(x;\theta_1) + (1-\alpha_i)F_2(x;\theta_2)$,
$i = 1,2,\ldots,k$, and we write $\alpha_i$ instead of $\alpha_{i1}$.
Suppose $F_1(x;\theta_1)$ and $F_2(x;\theta_2)$ are not specified;
however, we assume there exists some known point $x_0$ such that
$F_1(x_0;\theta_1) \ne F_2(x_0;\theta_2)$. Assume
$F_1(x_0;\theta_1) > F_2(x_0;\theta_2)$. Then $\alpha_i > \alpha_j$ if,
and only if, $G_i(x_0) > G_j(x_0)$. Hence, selecting the best is
equivalent to selecting the population associated with the largest
$G(x_0;\alpha_i)$ value.
For a given i, $1 \le i \le k$, and j, $1 \le j \le n$, define

$Y_{ij} = 1$ if $X_{ij} \le x_0$, and $Y_{ij} = 0$ otherwise,

and define

$\hat{G}_i(x_0) = \sum_{j=1}^{n} Y_{ij}$.

Then, it is obvious that $\hat{G}_i(x_0)$ is a binomial random variable
with cdf $B(n;G_i(x_0))$.

We define a selection procedure $R_B$ as follows:

$R_B$:  Select the population $\pi_i$ which is associated with the
largest $\hat{G}_i(x_0)$.
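
The counting rule just described is simple enough to sketch directly.
The code below counts, for each population, the observations not
exceeding a known point and selects the population with the largest
count. The function names are illustrative assumptions, and ties are
resolved here by taking the first maximizer, which is a simplification
not specified in the text.

```python
def count_at_most(sample, x0):
    """Number of observations not exceeding x0 (sum of the indicators)."""
    return sum(1 for x in sample if x <= x0)


def select_by_count(samples, x0):
    """Select the population with the largest count at x0.

    samples is a list of k samples, one per population. Returns the
    0-based index of the selected population and the list of counts.
    Ties go to the first maximizer (a simplification for this sketch).
    """
    counts = [count_at_most(s, x0) for s in samples]
    best = max(range(len(samples)), key=lambda i: counts[i])
    return best, counts


if __name__ == "__main__":
    data = [[0.1, 0.9, 0.8], [0.1, 0.2, 0.9], [0.7, 0.8, 0.9]]
    print(select_by_count(data, x0=0.5))
```

The rule uses only the indicators at one point, so it needs no knowledge
of the component distributions beyond the ordering at that point.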

When n is large, we use the normal approximation. Let
$F_1(x_0;\theta_1) - F_2(x_0;\theta_2) = d_0 > 0$. Then, by the result of
[10], we have, asymptotically,

$n \approx c^2(P^*)\,(1 - \Delta^2 d_0^2)/(2\Delta^2 d_0^2)$,

when $\Delta \to 0$ and $P^* \to 1$. Again, by Feller's inequality, we
see that
$\Phi(z) \ge 1 - \dfrac{1}{\sqrt{2\pi}\,z}\, e^{-z^2/2}$,
from which $c^2(P^*)$ is obtained. Let $n_p$ and $n_B$ denote,
respectively, the sample sizes associated with $R_p$ and $R_B$ when
$\inf_{\boldsymbol{\alpha} \in \Omega(\lambda;\Delta)} P_{\boldsymbol{\alpha}}\{CS\} \ge P^*$
is satisfied for both rules. We define the asymptotic relative
efficiency of $R_p$ with respect to $R_B$ by

$ARE(R_p;R_B) = \lim \dfrac{n_p(P^*,\Delta)}{n_B(P^*,\Delta)}$

as $P^* \to 1$ and then $\Delta \to 0$. It follows from the previous
result and Theorem 3.2. We have

-2 

CA 1-26 

ARE(R  ;R„)  = lim  Urn  = 0. 


ARE(R  ;R|.)  = lim  lim 
P A^O  P*^l 


2 2 2 
2dQA^(l-P*)^ 


However, if we take $1 - P^* = \alpha = \Delta \to 0$, we have another
kind of efficiency, given by

$ARE(R_p;R_B) = \lim_{\Delta \to 0} \dfrac{n_p}{n_B} = 0$

for $0 < \delta < 1/2$.


This shows that $R_p$ is good compared to $R_B$. Also, $R_p$ applies for
general m and t. We should note that the cases m = 2 and m > 2 are quite
different and $R_B$ is useful only for m = 2.


(B)  Discrete  Case 

In this case, we take $F_1, F_2, \ldots, F_m$ to be discrete
distributions such that the outcomes from each distribution with cdf
$F_i$, for some i, can be classified into s ($\ge 2$) states. Let the
probability that an outcome from $F_i$ belongs to state j be denoted by
$p_{ij}$. We assume $F_1, F_2, \ldots, F_m$ are all specified and the
$p_{ij}$ are all given.

For $\boldsymbol{\alpha}_i \in \langle 0,1\rangle^m$ we define a mixture
distribution by

$G_i(x) = \boldsymbol{\alpha}_i \cdot \mathbf{F}$.

Then, $G_i(x)$ is also a discrete distribution such that the probability
of an outcome belonging to state j is given by

$g_{ij} = \alpha_{i1}p_{1j} + \alpha_{i2}p_{2j} + \cdots + \alpha_{im}p_{mj}$,
for $j = 1,2,\ldots,s$.

We assume there exists a lower bound g such that $g_{ij} \ge g > 0$ for
all $i = 1,2,\ldots,k$, $j = 1,2,\ldots,s$. Let n observations be drawn
from $\pi_i$ and let $n_{ij}$ denote the number of outcomes which belong
to state j. For any $\boldsymbol{\alpha} =$
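$(\alpha_1,\ldots,\alpha_m)$, we define the Matusita distance (see [8]) as
follows.

(3.3)  $S_i(\boldsymbol{\alpha}) = \sum_{j=1}^{s} \left(\sqrt{n_{ij}/n} - \sqrt{g_j}\right)^2$,

where $g_j = \sum_{i=1}^{m} \alpha_i p_{ij}$. $S_i(\boldsymbol{\alpha})$
is thus a function on $\langle 0,1\rangle^m$.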
i=l 
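Let $\hat{\boldsymbol{\alpha}}_i$ denote a value in
$\langle 0,1\rangle^m$ at which $S_i(\boldsymbol{\alpha})$ attains its
infimum, i.e. let $\hat{\boldsymbol{\alpha}}_i$ be such that

(3.4)  $S_i(\hat{\boldsymbol{\alpha}}_i) = \inf_{\boldsymbol{\alpha} \in \langle 0,1\rangle^m} S_i(\boldsymbol{\alpha})$.

For given n and $\lambda$, to select the t best with respect to
$\lambda$, we propose the following selection procedure.

R:  Select $\pi_{r_1}, \pi_{r_2}, \ldots, \pi_{r_t}$ if, and only if,
$\lambda(\hat{\boldsymbol{\alpha}}_{r_1}), \lambda(\hat{\boldsymbol{\alpha}}_{r_2}), \ldots, \lambda(\hat{\boldsymbol{\alpha}}_{r_t})$
are the t largest values of
$\lambda(\hat{\boldsymbol{\alpha}}_1), \lambda(\hat{\boldsymbol{\alpha}}_2), \ldots, \lambda(\hat{\boldsymbol{\alpha}}_k)$,
which are defined by (3.3) and (3.4). If there are ties, use a random
mechanism.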
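
A minimum distance estimate of this kind can be sketched for m = 2. The
code below assumes the squared Matusita distance between the observed
state frequencies and the mixture state probabilities, i.e. the sum over
states of (sqrt(n_j / n) - sqrt(g_j))^2, and minimizes it over a grid.
The exact form of the distance, the grid search and all names are
assumptions made for illustration.

```python
import math


def matusita_distance_sq(alpha, p, counts):
    """Squared Matusita distance between observed frequencies and the
    mixture state probabilities g_j = sum_i alpha_i * p[i][j].

    p[i][j] is the probability of state j under component i; counts[j] is
    the number of observations falling in state j.
    """
    n = sum(counts)
    total = 0.0
    for j in range(len(counts)):
        g_j = sum(a * p[i][j] for i, a in enumerate(alpha))
        total += (math.sqrt(counts[j] / n) - math.sqrt(g_j)) ** 2
    return total


def min_distance_estimate_m2(p, counts, steps=1000):
    """Grid minimizer over alpha in {0, 1/steps, ..., 1} for m = 2.

    Returns the weight of the first component (illustrative only).
    """
    best = min(
        range(steps + 1),
        key=lambda s: matusita_distance_sq(
            (s / steps, 1.0 - s / steps), p, counts
        ),
    )
    return best / steps


if __name__ == "__main__":
    p = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
    counts = [34, 26, 40]  # frequencies matching alpha = (0.4, 0.6)
    print(min_distance_estimate_m2(p, counts))
```

Because the minimizer does not change when the distance is replaced by
its square, the sketch is unaffected by that choice of convention.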

Theorem 3.4  The selection procedure R is consistent and strongly
asymptotically monotone with respect to $(\mathcal{F},\lambda)$.

Proof:  It has been shown in Matusita [8] that for our case
$\hat{\boldsymbol{\alpha}}_i \to \boldsymbol{\alpha}_i$ with probability
one in the usual sense of convergence of a sequence of vectors.
Therefore, $\lambda(\hat{\boldsymbol{\alpha}}_i) \to \lambda(\boldsymbol{\alpha}_i)$
WP1, since $\lambda$ is continuous. Using arguments analogous to those
given in the proof of Theorem 3.1, we can conclude the same results.
This completes the proof.